Partial LLL Reduction 



Xiaohu Xie 
School of Computer Science 
McGill University 
Montreal, Quebec, Canada H3A 2A7 
Email: xiaohu.xie@mail.mcgill.ca



Xiao-Wen Chang 
School of Computer Science 
McGill University 
Montreal, Quebec, Canada H3A 2A7 
Email: chang@cs.mcgill.ca 



Mazen Al Borno 
Department of Computer Science 
University of Toronto 
Toronto, Ontario, Canada M5S 2E4 
Email: mazen@dgp.toronto.edu 




Abstract — The Lenstra-Lenstra-Lovasz (LLL) reduction has 
wide applications in digital communications. It can greatly 
improve the speed of the sphere decoding (SD) algorithms 
for solving an integer least squares (ILS) problem and the 
performance of the Babai integer point, a suboptimal solution 
to the ILS problem. Recently Ling and Howgrave-Graham 
proposed the so-called effective LLL (ELLL) reduction. It has less 
computational complexity than LLL, while it has the same effect 
on the performance of the Babai integer point as LLL. In this 
paper we propose a partial LLL (PLLL) reduction. PLLL avoids 
the numerical stability problem with ELLL, which may result in 
very poor performance of the Babai integer point. Furthermore, 
numerical simulations indicated that it is faster than ELLL. We 
also show that in theory PLLL and ELLL have the same effect 
on the search speed of a typical SD algorithm as LLL. 

I. Introduction 

In a multiple-input and multiple-output (MIMO) system, 
often we have the following linear model: 



$$y = Hx + v, \tag{1}$$

where $y \in \mathbb{R}^n$ is the channel output vector, $v \in \mathbb{R}^n$ is the
noise vector following a normal distribution $\mathcal{N}(0, \sigma^2 I)$, $H \in \mathbb{R}^{n \times m}$
is the channel matrix, and $x \in \mathbb{Z}^m$ is the unknown
integer data vector. In some applications where complex ILS 
problems may need to be solved instead, we can first transform 
the complex ILS problems to equivalent real ILS problems. 
For simplicity, like [1], in this paper we assume m = n and 
H is nonsingular. 

To estimate x, one solves an integer least squares problem 



$$\min_{x \in \mathbb{Z}^n} \|y - Hx\|_2^2, \tag{2}$$

which gives the maximum-likelihood estimate of $x$. The ILS
problem has been proved to be NP-hard [2]. For applications
with strict real-time requirements, an approximate
solution of (2) is usually computed instead. An often used
approximation method is the nearest plane algorithm proposed
by Babai [3], and the approximate integer solution it produces
is referred to as the Babai integer point. In communications,
a method for finding this approximate solution is referred to
as a successive interference cancellation decoder.

A typical method to solve (2) is a sphere decoding (SD)
algorithm, such as the Schnorr-Euchner algorithm (see [4] and
[5]) or its variants (see, e.g., [6] and [7]). An SD algorithm
has two phases. First the reduction phase transforms (2) to
an equivalent problem. Then the search phase enumerates
integer points in a hyper-ellipsoid to find the optimal solution.
The reduction phase makes the search phase easier and more
efficient. The Lenstra-Lenstra-Lovasz (LLL) reduction [8] is
the most widely used reduction in practice. An LLL reduced basis
matrix has to satisfy two conditions. One is the size-reduction
condition and the other is the Lovasz condition (see Section
II for more details). Recently Ling and Howgrave-Graham [1]
argued geometrically that the size-reduction condition does not
change the performance of the Babai integer point. Then they
proposed the so-called effective LLL reduction (to be referred
to as ELLL) which mostly avoids size reduction. They proved
that their ELLL algorithm has less time complexity than the
original LLL algorithm given in [8]. However, as implicitly
pointed out in [1], the ELLL algorithm has a numerical
stability problem. Our simulations, presented in Section V,
will indicate that ELLL may give a much worse estimate of $x$
than the LLL reduction due to its numerical stability problem.

In this paper, we first show algebraically that the size- 
reduction condition of the LLL reduction has no effect on 
a typical SD search process. Thus it has no effect on the 
performance of the Babai integer point, the first integer point 
found in the search process. Then we propose a partial 
LLL reduction algorithm, to be referred to as PLLL, which 
avoids the numerical stability problem with ELLL. Numerical 
simulations indicate that it is faster than ELLL and is as 
numerically stable as LLL. 

II. LLL Reduction 

In matrix language, the LLL reduction can be described as 
a QRZ factorization [9]: 

$$Q^T H Z = R, \tag{3}$$

where $Q \in \mathbb{R}^{n \times n}$ is orthogonal, $Z \in \mathbb{Z}^{n \times n}$ is a unimodular
matrix (i.e., $\det(Z) = \pm 1$), and $R \in \mathbb{R}^{n \times n}$ is upper triangular
and satisfies the following two conditions:

$$\begin{aligned}
|r_{ij}| &\le |r_{ii}|/2, & 1 \le i < j \le n, \\
\delta\, r_{i-1,i-1}^2 &\le r_{i-1,i}^2 + r_{ii}^2, & 1 < i \le n,
\end{aligned} \tag{4}$$

where the parameter $\delta \in (1/4, 1]$. The first condition in (4) is
the size-reduction condition and the second condition in (4) is
the Lovasz condition.

Define $\bar{y} = Q^T y$ and $z = Z^{-1} x$. Then it is easy to see
that the ILS problem (2) is reduced to

$$\min_{z \in \mathbb{Z}^n} \|\bar{y} - Rz\|_2^2. \tag{5}$$

If $\hat{z}$ is the solution of the reduced ILS problem (5), then $\hat{x} =
Z\hat{z}$ is the ILS solution of the original problem (2).

The LLL algorithm first applies the Gram-Schmidt orthogonalization
(GSO) to $H$, finding the QR factors $Q$ and $R$
(more precisely speaking, to avoid square root computation,
the original LLL algorithm gives a column scaled $Q$ and a
row scaled $R$ which has unit diagonal entries). Two types of
basic unimodular matrices are then implicitly used to update
$R$ so that it satisfies (4): integer Gauss transformation (IGT)
matrices and permutation matrices, as described below.

To meet the first condition in (4), we can apply an IGT:

$$Z_{ij} = I - \zeta e_i e_j^T,$$

where $e_i$ is the $i$-th column of $I_n$. It is easy to verify that $Z_{ij}$
is unimodular. Applying $Z_{ij}$ ($i < j$) to $R$ from the right
gives

$$\bar{R} = R Z_{ij} = R - \zeta R e_i e_j^T.$$

Thus $\bar{R}$ is the same as $R$, except that $\bar{r}_{kj} = r_{kj} - \zeta r_{ki}$, $k =
1, \ldots, i$. By setting $\zeta = \lfloor r_{ij}/r_{ii} \rceil$, the nearest integer to
$r_{ij}/r_{ii}$, we ensure $|\bar{r}_{ij}| \le |\bar{r}_{ii}|/2$.
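The effect of a single IGT can be checked numerically. The following is a minimal Python sketch, assuming $R$ is stored as a list of rows with 0-based indices; the function name `apply_igt` and the sample matrix are illustrative choices, not part of the original algorithm description.

```python
def apply_igt(R, i, j):
    """Apply Z_ij = I - zeta * e_i e_j^T to upper triangular R from the right.

    Indices are 0-based with i < j; R is modified in place.
    Note: Python's round() resolves exact .5 ties to the even integer,
    which still satisfies the bound |r_ij| <= |r_ii| / 2.
    """
    zeta = round(R[i][j] / R[i][i])
    if zeta != 0:
        # Only rows 0..i change, because R[k][i] == 0 for k > i.
        for k in range(i + 1):
            R[k][j] -= zeta * R[k][i]
    return zeta


# Illustrative 3 x 3 upper triangular matrix (assumed values).
R = [[2.0, 3.0, 7.0],
     [0.0, 4.0, 9.0],
     [0.0, 0.0, 5.0]]
zeta = apply_igt(R, 1, 2)          # 9 / 4 = 2.25, so zeta = 2
print(zeta, R[0][2], R[1][2])
```

After the call, the entry in position (1, 2) satisfies the size-reduction bound, and the entry above it in the same column has changed as well, as the formula for $\bar{r}_{kj}$ predicts.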

To meet the second condition in (4), we permute
columns. Suppose that we interchange columns $i-1$ and $i$
of $R$. Then the upper triangular structure of $R$ is no longer
maintained. But we can bring $R$ back to an upper triangular
matrix by using the GSO technique (see [8]):

$$\bar{R} = G_{i-1,i} R P_{i-1,i},$$

where $G_{i-1,i}$ is an orthogonal matrix and $P_{i-1,i}$ is a permutation
matrix. Thus,

$$\begin{aligned}
\bar{r}_{i-1,i-1}^2 &= r_{i-1,i}^2 + r_{ii}^2, \\
\bar{r}_{i-1,i}^2 + \bar{r}_{ii}^2 &= r_{i-1,i-1}^2.
\end{aligned} \tag{6}$$

If $\delta r_{i-1,i-1}^2 > r_{i-1,i}^2 + r_{ii}^2$, then the above operation guarantees
$\delta \bar{r}_{i-1,i-1}^2 < \bar{r}_{i-1,i}^2 + \bar{r}_{ii}^2$.
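The two identities in (6) can be verified on a small example. The sketch below is an assumption-laden illustration: it uses a Givens rotation for $G_{i-1,i}$ in place of the GSO-based update of [8], stores $R$ as a list of rows, and uses 0-based indices. The function name is hypothetical.

```python
import math


def swap_and_retriangularize(R, i):
    """Swap columns i-1 and i of upper triangular R (0-based, i >= 1), then
    rotate rows i-1 and i with a Givens rotation to restore the triangle."""
    n = len(R)
    for row in R:
        row[i - 1], row[i] = row[i], row[i - 1]
    a, b = R[i - 1][i - 1], R[i][i - 1]
    r = math.hypot(a, b)               # nonzero because the old r_ii is nonzero
    c, s = a / r, b / r
    for col in range(i - 1, n):
        top, bot = R[i - 1][col], R[i][col]
        R[i - 1][col] = c * top + s * bot
        R[i][col] = -s * top + c * bot
    R[i][i - 1] = 0.0                  # remove rounding residue below the diagonal
    return R


# Illustrative values: r_11 = 3, r_12 = 1, r_22 = 2.
R = swap_and_retriangularize([[3.0, 1.0], [0.0, 2.0]], 1)
print(R[0][0] ** 2)                    # compare with 1^2 + 2^2
print(R[0][1] ** 2 + R[1][1] ** 2)     # compare with 3^2
```

A rotation is used here only because it is easy to write down for two rows; any orthogonal $G_{i-1,i}$ that restores the triangular form gives the same squared entries.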

The LLL reduction process is described in Algorithm 1. 
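For readers who prefer running code, the reduction process can be sketched in Python as follows. This is a minimal sketch under several assumptions: the R factor is computed by modified Gram-Schmidt with square roots (not the square-root-free scaled variant of [8]), the triangularization after a column swap uses a Givens rotation, indices are 0-based, and all function names are illustrative.

```python
import math


def qr_r_factor(H):
    """R factor of H = QR by modified Gram-Schmidt (H assumed nonsingular)."""
    n = len(H)
    cols = [[H[r][c] for r in range(n)] for c in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        R[k][k] = math.sqrt(sum(x * x for x in cols[k]))
        q = [x / R[k][k] for x in cols[k]]
        for j in range(k + 1, n):
            R[k][j] = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [b - R[k][j] * a for a, b in zip(q, cols[j])]
    return R


def igt(R, Z, i, k):
    """Reduce r_ik with an IGT and accumulate it in Z (column k -= zeta * column i)."""
    zeta = round(R[i][k] / R[i][i])
    if zeta:
        for row in range(i + 1):
            R[row][k] -= zeta * R[row][i]
        for row in Z:
            row[k] -= zeta * row[i]


def swap(R, Z, k):
    """Swap columns k-1 and k of R and Z, then re-triangularize R by a Givens rotation."""
    n = len(R)
    for M in (R, Z):
        for row in M:
            row[k - 1], row[k] = row[k], row[k - 1]
    a, b = R[k - 1][k - 1], R[k][k - 1]
    r = math.hypot(a, b)
    c, s = a / r, b / r
    for col in range(k - 1, n):
        t, u = R[k - 1][col], R[k][col]
        R[k - 1][col], R[k][col] = c * t + s * u, -s * t + c * u
    R[k][k - 1] = 0.0


def lll(H, delta=0.75):
    """Return (R, Z) such that R is LLL reduced and R^T R = (H Z)^T (H Z)."""
    n = len(H)
    R = qr_r_factor(H)
    Z = [[int(i == j) for j in range(n)] for i in range(n)]
    k = 1                                   # 0-based; corresponds to k = 2 in the text
    while k < n:
        igt(R, Z, k - 1, k)
        if delta * R[k - 1][k - 1] ** 2 > R[k - 1][k] ** 2 + R[k][k] ** 2:
            swap(R, Z, k)
            k = max(k - 1, 1)
        else:
            for i in range(k - 2, -1, -1):
                igt(R, Z, i, k)
            k += 1
    return R, Z


R, Z = lll([[1, 2, 4], [0, 1, 2], [0, 0, 1]])
print(R)
print(Z)
```

The sample input is an illustrative unimodular matrix, so the reduced $R$ is the identity and $Z$ is the inverse of the input.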

III. SD Search Process and Babai Integer Point 

For later use we briefly introduce the often used SD search
process (see, e.g., [7, Section II.B]), which is a depth-first
search (DFS) through a tree. The idea of SD is to search for the
optimal solution of (5) in a hyper-ellipsoid defined as follows:

$$\|\bar{y} - Rz\|_2^2 < \beta. \tag{7}$$

Define

$$c_n = \bar{y}_n / r_{nn}, \qquad c_k = \Big(\bar{y}_k - \sum_{j=k+1}^{n} r_{kj} z_j\Big) / r_{kk}, \quad k = n-1, \ldots, 1. \tag{8}$$

Then it is easy to show that (7) is equivalent to

$$\text{level } k: \quad r_{kk}^2 (z_k - c_k)^2 < \beta - \sum_{j=k+1}^{n} r_{jj}^2 (z_j - c_j)^2, \tag{9}$$

where $k = n, n-1, \ldots, 1$.



Algorithm 1 LLL reduction

 1: apply GSO to obtain $H = QR$;
 2: set $Z = I_n$, $k = 2$;
 3: while $k \le n$ do
 4:     apply IGT $Z_{k-1,k}$ to reduce $r_{k-1,k}$: $R = R Z_{k-1,k}$;
 5:     update $Z$: $Z = Z Z_{k-1,k}$;
 6:     if $\delta r_{k-1,k-1}^2 > (r_{k-1,k}^2 + r_{kk}^2)$ then
 7:         permute and triangularize $R$: $R = G_{k-1,k} R P_{k-1,k}$;
 8:         update $Z$: $Z = Z P_{k-1,k}$;
 9:         $k = k - 1$, when $k > 2$;
10:     else
11:         for $i = k-2, \ldots, 1$ do
12:             apply IGT $Z_{ik}$ to reduce $r_{ik}$: $R = R Z_{ik}$;
13:             update $Z$: $Z = Z Z_{ik}$;
14:         end for
15:         $k = k + 1$;
16:     end if
17: end while



Suppose $z_n, z_{n-1}, \ldots, z_{k+1}$ have been fixed. We try to
determine $z_k$ at level $k$ by using (9). We first compute $c_k$
and then take $z_k = \lfloor c_k \rceil$. If (9) holds, we move to level $k-1$
to try to fix $z_{k-1}$. If at level $k-1$ we cannot find any integer
for $z_{k-1}$ such that (9) (with $k$ replaced by $k-1$) holds, we
move back to level $k$ and take $z_k$ to be the next nearest integer
to $c_k$. If (9) holds for the chosen value of $z_k$, we again move
to level $k-1$; otherwise we move back to level $k+1$, and
so on. Thus after $z_n, \ldots, z_{k+1}$ are fixed, we try all possible
values of $z_k$ in the following order until (9) does not hold
anymore and we move back to level $k+1$:

$$\begin{aligned}
&\lfloor c_k \rceil,\ \lfloor c_k \rceil - 1,\ \lfloor c_k \rceil + 1,\ \lfloor c_k \rceil - 2,\ \ldots, && \text{if } c_k \le \lfloor c_k \rceil, \\
&\lfloor c_k \rceil,\ \lfloor c_k \rceil + 1,\ \lfloor c_k \rceil - 1,\ \lfloor c_k \rceil + 2,\ \ldots, && \text{if } c_k > \lfloor c_k \rceil.
\end{aligned} \tag{10}$$

When we reach level 1, we compute $c_1$ and take $z_1 = \lfloor c_1 \rceil$. If
(9) (with $k = 1$) holds, an integer point, say $\hat{z}$, is found. We
update $\beta$ by setting $\beta = \|\bar{y} - R\hat{z}\|_2^2$ and try to update $\hat{z}$ to find
a better integer point in the new hyper-ellipsoid. Finally when
we cannot find any new value for $z_n$ at level $n$ such that the
corresponding inequality holds, the search process stops and
the latest found integer point is the optimal solution we seek.

At the beginning of the search process, we set $\beta = \infty$. The
first integer point $z$ found in the search process is referred to
as the Babai integer point.
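The search process described above can be sketched in Python as a recursive depth-first search. This is an illustrative sketch, not the implementation of [7]: it assumes $R$ is a list of rows with nonzero diagonal, uses 0-based levels (level `n - 1` is the top), and the function name `sd_search` and the sample data are assumptions.

```python
import math


def sd_search(R, y):
    """Depth-first SD search for min ||y - R z||^2 over integer vectors z.

    Returns (babai_point, optimal_point)."""
    n = len(R)
    z = [0] * n
    state = {"beta": math.inf, "best": None, "babai": None}

    def visit(k, partial):
        # partial = sum over j > k of r_jj^2 (z_j - c_j)^2
        c = (y[k] - sum(R[k][j] * z[j] for j in range(k + 1, n))) / R[k][k]
        first = round(c)
        step = -1 if c <= first else 1       # direction of the second candidate in (10)
        t = 0
        while True:
            # zig-zag order: first, first+step, first-step, first+2*step, ...
            cand = first + ((t + 1) // 2) * (step if t % 2 else -step)
            dist = partial + (R[k][k] * (cand - c)) ** 2
            if dist >= state["beta"]:
                return                       # (9) fails: move back up one level
            z[k] = cand
            if k == 0:
                state["beta"], state["best"] = dist, z[:]
                if state["babai"] is None:
                    state["babai"] = z[:]    # first point found
            else:
                visit(k - 1, dist)
            t += 1

    visit(n - 1, 0.0)
    return state["babai"], state["best"]


# Illustrative, deliberately skewed R (assumed values).
babai, best = sd_search([[1.0, 0.9], [0.0, 0.1]], [0.45, 0.06])
print(babai, best)
```

Because candidates at each level are visited in order of nondecreasing distance from $c_k$, the loop can stop at the first candidate that violates (9). In this example the Babai point differs from the optimal point, which is typical when the diagonal of $R$ decreases sharply.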

IV. Partial LLL Reduction 
A. Effects of size reduction on search 

Ling and Howgrave-Graham [1] have argued geometrically
that the performance of the Babai integer point is not affected 
by size reduction (see the first condition in (4)). This result 
can be extended. In fact we will prove algebraically that the 
search process is not affected by size reduction. 

We stated in Section II that the size-reduction condition in 
(4) is met by using IGTs. It will be sufficient if we can show 
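that one IGT will not affect the search process. Suppose that
two upper triangular matrices $R \in \mathbb{R}^{n \times n}$ and $\bar{R} \in \mathbb{R}^{n \times n}$ have
the relation:

$$\bar{R} = R Z_{st}, \qquad Z_{st} = I - \zeta e_s e_t^T, \quad s < t.$$

Thus,

$$\bar{r}_{kt} = r_{kt} - \zeta r_{ks}, \quad \text{if } k \le s, \tag{11}$$

$$\bar{r}_{kj} = r_{kj}, \quad \text{if } k > s \text{ or } j \ne t. \tag{12}$$

Let $\bar{z} = Z_{st}^{-1} z$. Then the ILS problem (5) is equivalent to

$$\min_{\bar{z} \in \mathbb{Z}^n} \|\bar{y} - \bar{R}\bar{z}\|_2^2. \tag{13}$$

For this ILS problem, the inequality the search process needs
to check at level $k$ is

$$\text{level } k: \quad \bar{r}_{kk}^2 (\bar{z}_k - \bar{c}_k)^2 < \beta - \sum_{j=k+1}^{n} \bar{r}_{jj}^2 (\bar{z}_j - \bar{c}_j)^2, \tag{14}$$

where $\bar{c}_k$ is defined as in (8) with $R$ and $z$ replaced by $\bar{R}$ and $\bar{z}$.

Now we look at the search process for the two equivalent ILS
problems.

Suppose $z_n, z_{n-1}, \ldots, z_{k+1}$ and $\bar{z}_n, \bar{z}_{n-1}, \ldots, \bar{z}_{k+1}$ have
been fixed. We consider the search process at level $k$ under
three different cases.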

• Case 1: $k > s$. Note that $\bar{R}_{k:n,k:n} = R_{k:n,k:n}$. It is
easy to see that we must have $\bar{c}_i = c_i$ and $\bar{z}_i = z_i$ for
$i = n, n-1, \ldots, k+1$. Thus at level $k$, $\bar{c}_k = c_k$ and
the search process takes an identical value for $\bar{z}_k$ and $z_k$.
For the chosen value, the two inequalities (9) and (14)
are identical. So both hold or fail at the same time.

• Case 2: $k = s$. According to Case 1, we have $\bar{z}_i = z_i$ for
$i = n, n-1, \ldots, s+1$. Thus

$$\bar{c}_k = \frac{\bar{y}_k - \sum_{j=k+1}^{n} \bar{r}_{kj} \bar{z}_j}{\bar{r}_{kk}}
= \frac{\bar{y}_k - \sum_{j=k+1,\, j \ne t}^{n} r_{kj} z_j - (r_{kt} - \zeta r_{kk}) z_t}{r_{kk}}
= c_k + \zeta z_t,$$

where $\zeta$ and $z_t$ are integers. Note that $z_k$ and $\bar{z}_k$ take on
values according to (10). Thus values of $z_k$ and $\bar{z}_k$ taken
by the search process at level $k$ must satisfy $\bar{z}_k = z_k + \zeta z_t$.
In other words, there exists a one-to-one mapping between
the values of $z_k$ and $\bar{z}_k$. For the chosen values of $z_k$ and
$\bar{z}_k$, $\bar{z}_k - \bar{c}_k = z_k - c_k$. Thus, again the two inequalities (9)
and (14) are identical. Therefore both inequalities hold or
fail at the same time.

• Case 3: $k < s$. According to Case 1 and Case 2, $\bar{z}_i = z_i$
for $i = n, n-1, \ldots, s+1$ and $\bar{z}_s = z_s + \zeta z_t$. Then for
$k = s - 1$,

$$\begin{aligned}
\bar{c}_k &= \frac{\bar{y}_k - \sum_{j=k+1}^{n} \bar{r}_{kj} \bar{z}_j}{\bar{r}_{kk}} \\
&= \frac{\bar{y}_k - \sum_{j=k+2,\, j \ne t}^{n} r_{kj} z_j - r_{ks} \bar{z}_s - \bar{r}_{kt} z_t}{r_{kk}} \\
&= \frac{\bar{y}_k - \sum_{j=k+1}^{n} r_{kj} z_j - \zeta r_{ks} z_t + \zeta r_{ks} z_t}{r_{kk}} \\
&= c_k.
\end{aligned}$$

Thus the search process takes an identical value for $\bar{z}_k$
and $z_k$ when $k = s - 1$. By induction we can similarly
show this is true for a general $k < s$. Thus, again the two
inequalities (9) and (14) are identical. Therefore they hold
or fail at the same time.
In the above we have proved that the search process is
identical for both ILS problems (5) and (13) (actually the two
search trees have an identical structure). Thus the speed of the
search process is not affected by the size-reduction condition
in (4). For any two integer points $z^*$ and $\bar{z}^*$ found in the
search process at the same time for the two ILS problems, we
have seen that $\bar{z}_i^* = z_i^*$ for $i = n, \ldots, s+1, s-1, \ldots, 1$ and
$\bar{z}_s^* = z_s^* + \zeta z_t^*$, i.e., $\bar{z}^* = Z_{st}^{-1} z^*$. Then

$$\|\bar{y} - \bar{R}\bar{z}^*\|_2^2 = \|\bar{y} - R z^*\|_2^2.$$

Thus, the performance of the Babai point is not affected by the
size-reduction condition in (4) either, as [1] showed
from a geometric perspective.
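The relation between the two Babai points can be checked on a small numerical example. The sketch below assumes 0-based indices and list-of-rows storage; the matrix, the vector and the integer `zeta` are arbitrary illustrative values (the argument above holds for any integer $\zeta$), and the helper names are hypothetical.

```python
def babai_point(R, y):
    """First point found by the search: z_k = round(c_k) for k = n-1, ..., 0."""
    n = len(R)
    z = [0] * n
    for k in range(n - 1, -1, -1):
        c = (y[k] - sum(R[k][j] * z[j] for j in range(k + 1, n))) / R[k][k]
        z[k] = round(c)
    return z


def residual_sq(R, y, z):
    """Squared residual ||y - R z||^2 for upper triangular R."""
    n = len(R)
    return sum((y[k] - sum(R[k][j] * z[j] for j in range(k, n))) ** 2
               for k in range(n))


R = [[2.0, 5.0, 1.0],
     [0.0, 3.0, 7.0],
     [0.0, 0.0, 4.0]]
y = [1.3, -2.6, 9.1]
s, t, zeta = 0, 2, 3                     # Z_st = I - zeta * e_s e_t^T (assumed values)

# R_bar = R Z_st: column t changes in rows 0..s only, see (11) and (12).
R_bar = [row[:] for row in R]
for k in range(s + 1):
    R_bar[k][t] -= zeta * R_bar[k][s]

z = babai_point(R, y)
z_bar = babai_point(R_bar, y)
print(z, z_bar)
print(residual_sq(R, y, z), residual_sq(R_bar, y, z_bar))
```

The two points differ only in component $s$, by $\zeta z_t$, and the two residuals agree, in line with the identity above.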

However, the IGTs which reduce the super-diagonal entries
of $R$ are not useless when they are followed by permutations.
Suppose $|r_{i-1,i}| > |r_{i-1,i-1}|/2$. If we apply $Z_{i-1,i}$ to reduce
$r_{i-1,i}$, permute columns $i-1$ and $i$ of $R$ and triangularize it,
we have from (6) and (11) that

$$\bar{r}_{i-1,i-1}^2 = (r_{i-1,i} - \zeta r_{i-1,i-1})^2 + r_{ii}^2 < r_{i-1,i}^2 + r_{ii}^2,$$

where $\zeta = \lfloor r_{i-1,i}/r_{i-1,i-1} \rceil$. From (6) we observe that the IGT can make $|\bar{r}_{i-1,i-1}|$ smaller
after permutation and triangularization. Correspondingly $|\bar{r}_{ii}|$
becomes larger, as it is easy to prove that $|r_{i-1,i-1} r_{ii}|$ remains
unchanged after the above operations.
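One way to see why the product is unchanged is a determinant argument on the $2 \times 2$ diagonal block. The sketch below assumes that $G_{i-1,i}$ acts only on rows $i-1$ and $i$; the symbols $B$, $\bar{B}$ and $G$ (the $2 \times 2$ orthogonal part of $G_{i-1,i}$) are introduced for this note only.

```latex
B = \begin{bmatrix} r_{i-1,i-1} & r_{i-1,i} \\ 0 & r_{ii} \end{bmatrix},
\qquad
\bar{B} = G \, B
  \begin{bmatrix} 1 & -\zeta \\ 0 & 1 \end{bmatrix}
  \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
= \begin{bmatrix} \bar{r}_{i-1,i-1} & \bar{r}_{i-1,i} \\ 0 & \bar{r}_{ii} \end{bmatrix},
\\[4pt]
|\det \bar{B}| = |\det G| \cdot |\det B| \cdot |1| \cdot |-1| = |\det B|
\quad \Longrightarrow \quad
|\bar{r}_{i-1,i-1} \, \bar{r}_{ii}| = |r_{i-1,i-1} \, r_{ii}|.
```

Since the IGT makes $|\bar{r}_{i-1,i-1}|$ smaller than it would be without the IGT and the product is fixed, $|\bar{r}_{ii}|$ must be correspondingly larger.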

The ELLL algorithm given in [1] is essentially identical to
Algorithm 1 after lines 11-14, which reduce other off-diagonal
entries of R, are removed.

B. Numerical stability issue 

We have shown that in the LLL reduction, an IGT is useful
only if it reduces a super-diagonal entry. Theoretically, all
other IGTs will have no effect on the search process. But
simply removing those IGTs can cause serious numerical
stability problems even if $H$ is not ill conditioned. The main
cause of the stability problem is that during the reduction
process, some entries of $R$ may grow significantly. For the
following $n \times n$ upper triangular matrix

^12 4 
12 
1 2 4 



H = 



1 2 



(15) 



when $n = 100$, the condition number $\kappa_2(H) \approx 34$. The LLL
reduction will reduce $H$ to an identity matrix $I$. However, if
we apply the ELLL reduction, the maximum absolute value in
$R$ will be $2^{n-1}$. When $n$ is big enough, an integer overflow
will occur.
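The growth mechanism can be reproduced on a small stand-in example. The Python sketch below uses a hypothetical banded matrix chosen for this illustration (1 on the diagonal, 2 and 4 on the next two super-diagonals); it is not claimed to be the matrix in (15), and its conditioning is not examined here. The reduction applies only super-diagonal IGTs, as ELLL does when no permutation is triggered, and the function names are assumptions.

```python
def superdiagonal_only_reduction(R, delta=0.75):
    """Reduce only the super-diagonal entries, column by column.

    R is an upper triangular list of rows holding Python ints, modified in place.
    The assertion checks that the Lovasz condition holds after each IGT, so a
    reduction of this kind would never permute columns on this input."""
    n = len(R)
    for k in range(1, n):
        zeta = round(R[k - 1][k] / R[k - 1][k - 1])
        for row in range(k):
            R[row][k] -= zeta * R[row][k - 1]
        assert delta * R[k - 1][k - 1] ** 2 <= R[k - 1][k] ** 2 + R[k][k] ** 2
    return R


def banded_example(n):
    """Hypothetical test matrix: 1 on the diagonal, 2 and 4 on the two super-diagonals."""
    return [[{0: 1, 1: 2, 2: 4}.get(j - i, 0) for j in range(n)] for i in range(n)]


n = 10
R = superdiagonal_only_reduction(banded_example(n))
largest = max(abs(x) for row in R for x in row)
print(largest, 2 ** (n - 1))
```

Each unreduced entry above the super-diagonal is doubled when the next column is processed, so the largest entry grows geometrically with $n$. Python integers do not overflow, but fixed-width integer or floating point storage eventually would lose information.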

In the ELLL algorithm, the super-diagonal entries are always
reduced. But if a permutation does not occur immediately
after the size reduction, then this size reduction is useless
in theory and furthermore it may help the growth of the
other off-diagonal entries in the same column. Therefore, for
efficiency and numerical stability, we propose a new strategy
of applying IGTs in Algorithm 1. First we compute $\zeta =
\lfloor r_{k-1,k}/r_{k-1,k-1} \rceil$. Then we test if the following inequality

$$\delta r_{k-1,k-1}^2 > (r_{k-1,k} - \zeta r_{k-1,k-1})^2 + r_{kk}^2$$

holds. If it does not, then the permutation of columns $k-1$ and
$k$ will not occur, no IGT will be applied, and the algorithm
moves to column $k+1$. Otherwise, if $\zeta \ne 0$, the algorithm
reduces $r_{k-1,k}$ and if $|\zeta| \ge 2$, the algorithm also reduces all
$r_{ik}$ for $i = k-2, k-3, \ldots, 1$ for stability consideration.
When $|\zeta| = 1$, we did not notice any stability problem if we
do not reduce the size of $r_{ik}$ for $i = k-2, k-3, \ldots, 1$.
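The decision rule just described can be written as a small function. This is a sketch with assumed names and 0-based indices; it only decides what to do for column $k$ and does not modify $R$.

```python
def plll_igt_plan(R, k, delta=0.75):
    """Decide the PLLL action for column k (0-based, k >= 1).

    Returns (permute, zeta, reduce_whole_column):
      permute              True if columns k-1 and k will be swapped,
      zeta                 multiplier of the IGT applied to r_{k-1,k} (0 means no IGT),
      reduce_whole_column  True if r_{ik}, i < k-1, are also reduced (|zeta| >= 2)."""
    zeta = round(R[k - 1][k] / R[k - 1][k - 1])
    alpha = (R[k - 1][k] - zeta * R[k - 1][k - 1]) ** 2
    permute = delta * R[k - 1][k - 1] ** 2 > alpha + R[k][k] ** 2
    if not permute:
        return False, 0, False           # no IGT at all; move on to column k + 1
    return True, zeta, abs(zeta) >= 2


# Illustrative inputs (assumed values).
print(plll_igt_plan([[4.0, 9.0], [0.0, 1.0]], 1))   # swap will occur, zeta = 2
print(plll_igt_plan([[1.0, 2.0], [0.0, 1.0]], 1))   # no swap, so no IGT is applied
```

In the second call the super-diagonal entry is not size reduced, yet nothing is done, because reducing it would not lead to a permutation.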

C. Householder QR with minimum column pivoting 

In the original LLL reduction and the ELLL reduction, GSO
is used to compute the QR factorization of $H$ and to update $R$
in the later steps. The cost of computing the QR factorization
by GSO is $2n^3$ flops, larger than the $4n^3/3$ flops required by
the QR factorization by Householder reflections (note that we
do not need to form the Q factor explicitly in the reduction
process); see, e.g., [10, Chap 5]. Thus we propose to compute
the QR factorization by Householder reflections instead of
GSO.

Roughly speaking, the reduction would like to have small
diagonal entries at the beginning and large diagonal entries at
the end. In our new reduction algorithm, the IGTs are applied
only when a permutation will occur. The fewer permutations
occur, the faster the new reduction algorithm runs.
To reduce the number of permutations in the reduction
process, we propose to compute the QR factorization with
minimum-column-pivoting:



$$Q^T H P = R, \tag{16}$$

where $P \in \mathbb{Z}^{n \times n}$ is a permutation matrix. In the $k$-th step
of the QR factorization, we find the column in $H_{k:n,k:n}$,
say column $j$, which has the minimum 2-norm. Then we
interchange columns $k$ and $j$ of $H$. After this we do what
the $k$-th step of a regular Householder QR factorization does.
Algorithm 2 describes the process of the factorization.

Note that the cost of computing $l_j$ in the algorithm is
negligible compared with the other cost.

As Givens rotations have better numerical stability than
GSO, in line 7 of Algorithm 1, we propose to use a Givens
rotation to do triangularization.

D. PLLL reduction algorithm

Now we combine the strategies we proposed in the previous
subsections and give a description of the reduction process
in Algorithm 3, to be referred to as a partial LLL (PLLL)
reduction algorithm.

Algorithm 2 QR with minimum-column-pivoting

    set $R = H$, $P = I_n$;
    compute $l_k = \|r_k\|_2^2$, $k = 1, \ldots, n$;
    for $k = 1, 2, \ldots, n$ do
        find $j$ such that $l_j$ is the minimum among $l_k, \ldots, l_n$;
        exchange columns $k$ and $j$ of $R$, $l$ and $P$;
        apply a Householder reflection $Q_k$ to eliminate $r_{k+1,k}, r_{k+2,k}, \ldots, r_{n,k}$;
        update $l_j$ by setting $l_j = l_j - r_{kj}^2$, $j = k+1, \ldots, n$;
    end for
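Algorithm 2 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: matrices are lists of rows, indices are 0-based, the permutation is returned as an index list `perm` instead of a matrix $P$, and $Q$ is never formed. The function name is hypothetical.

```python
import math


def qr_min_pivot(H):
    """Householder QR with minimum-column-pivoting.

    Returns (R, perm): R is upper triangular and perm[k] is the index of the
    column of H moved to position k, so that Q^T H P = R for an orthogonal Q."""
    n = len(H)
    R = [[float(x) for x in row] for row in H]
    perm = list(range(n))
    l = [sum(R[i][j] ** 2 for i in range(n)) for j in range(n)]   # squared column norms
    for k in range(n):
        j = min(range(k, n), key=lambda c: l[c])
        for row in R:
            row[k], row[j] = row[j], row[k]
        l[k], l[j] = l[j], l[k]
        perm[k], perm[j] = perm[j], perm[k]
        # Householder vector v = x + sign(x_0) ||x|| e_1 for x = R[k:, k].
        x = [R[i][k] for i in range(k, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x > 0.0:
            v = x[:]
            v[0] += math.copysign(norm_x, x[0])
            vtv = sum(t * t for t in v)
            for c in range(k, n):
                dot = sum(v[i] * R[k + i][c] for i in range(n - k))
                f = 2.0 * dot / vtv
                for i in range(n - k):
                    R[k + i][c] -= f * v[i]
        for i in range(k + 1, n):
            R[i][k] = 0.0                 # clear rounding residue below the diagonal
        for c in range(k + 1, n):
            l[c] -= R[k][c] ** 2          # downdate the remaining squared norms
    return R, perm


R, perm = qr_min_pivot([[3, 1], [4, 0]])
print(R, perm)
```

The norm downdate in the last loop is cheap, which is the point made in the text, but it can lose accuracy through cancellation; recomputing the norms occasionally is a common safeguard that this sketch omits.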



Algorithm 3 PLLL reduction

    compute the Householder QR factorization with minimum pivoting: $Q^T H P = R$;
    set $Z = P$, $k = 2$;
    while $k \le n$ do
        $\zeta = \lfloor r_{k-1,k}/r_{k-1,k-1} \rceil$, $\alpha = (r_{k-1,k} - \zeta r_{k-1,k-1})^2$;
        if $\delta r_{k-1,k-1}^2 > (\alpha + r_{kk}^2)$ then
            if $\zeta \ne 0$ then
                apply the IGT $Z_{k-1,k}$ to reduce $r_{k-1,k}$;
                update $Z$: $Z = Z Z_{k-1,k}$;
                if $|\zeta| \ge 2$ then
                    for $i = k-2, \ldots, 1$ do
                        apply the IGT $Z_{ik}$ to reduce $r_{ik}$;
                        update $Z$: $Z = Z Z_{ik}$;
                    end for
                end if
            end if
            permute and triangularize: $R = G_{k-1,k} R P_{k-1,k}$;
            update $Z$: $Z = Z P_{k-1,k}$;
            $k = k - 1$, when $k > 2$;
        else
            $k = k + 1$;
        end if
    end while
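The main loop of Algorithm 3 can be sketched in Python as follows. The sketch makes several assumptions: it starts from an upper triangular $R$ that is already available (for instance from a pivoted QR factorization), so $Z$ starts as the identity instead of $P$; it uses a Givens rotation for the triangularization; indices are 0-based; and the function name is illustrative.

```python
import math


def plll(R, delta=0.75):
    """PLLL reduction of an upper triangular R with nonzero diagonal (in place).

    Returns the unimodular matrix Z accumulated from the IGTs and column swaps."""
    n = len(R)
    Z = [[int(i == j) for j in range(n)] for i in range(n)]

    def igt(i, k, zeta):
        for row in range(i + 1):
            R[row][k] -= zeta * R[row][i]
        for row in Z:
            row[k] -= zeta * row[i]

    k = 1                                       # 0-based; k = 2 in Algorithm 3
    while k < n:
        zeta = round(R[k - 1][k] / R[k - 1][k - 1])
        alpha = (R[k - 1][k] - zeta * R[k - 1][k - 1]) ** 2
        if delta * R[k - 1][k - 1] ** 2 > alpha + R[k][k] ** 2:
            if zeta != 0:
                igt(k - 1, k, zeta)
                if abs(zeta) >= 2:              # reduce the rest of column k for stability
                    for i in range(k - 2, -1, -1):
                        igt(i, k, round(R[i][k] / R[i][i]))
            # swap columns k-1 and k, then restore the triangle with a Givens rotation
            for M in (R, Z):
                for row in M:
                    row[k - 1], row[k] = row[k], row[k - 1]
            a, b = R[k - 1][k - 1], R[k][k - 1]
            r = math.hypot(a, b)
            c, s = a / r, b / r
            for col in range(k - 1, n):
                t, u = R[k - 1][col], R[k][col]
                R[k - 1][col], R[k][col] = c * t + s * u, -s * t + c * u
            R[k][k - 1] = 0.0
            k = max(k - 1, 1)
        else:
            k += 1                              # no permutation, hence no IGT
    return Z


R0 = [[4.0, 9.0], [0.0, 1.0]]                   # illustrative input (assumed values)
R = [row[:] for row in R0]
Z = plll(R)
print(R, Z)
```

Note that the returned $R$ need not satisfy the size-reduction condition in (4); by the result of Section IV-A this does not change the search process.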



V. Numerical Experiments 

In this section we give numerical test results to compare
the efficiency and stability of LLL, ELLL and PLLL. Our
simulations were performed in MATLAB 7.8 on a PC running
Linux. The parameter $\delta$ in the reduction was set to 3/4 in
the experiments. Two types of matrices were tested.

1) Type 1. The elements of $H$ were drawn from an i.i.d.
zero-mean, unit variance Gaussian distribution.

2) Type 2. $H = UDV^T$, where $U$ and $V$ are the Q-factors
of the QR factorizations of random matrices
and $D$ is a diagonal matrix, whose first half diagonal
entries follow an i.i.d. uniform distribution over 10 to
100, and whose second half diagonal entries follow an
i.i.d. uniform distribution over 0.1 to 1. So the condition
number of $H$ is bounded above by 1000.

For matrices of Type 1, we performed 200 runs for each dimension
$n$. Figure 1 gives the average flops of the three reduction
algorithms, and Figure 2 gives the average relative backward
error $\|H - Q_c R_c Z_c^{-1}\|_2 / \|H\|_2$, where $Q_c$, $R_c$ and $Z_c^{-1}$
are the computed factors of the QRZ factorization produced
by the reduction. From Figure 1 we see that PLLL is faster
than both LLL and ELLL. From Figure 2 we observe that
the relative backward error for both LLL and PLLL behaves
like $O(nu)$, where $u \approx 10^{-16}$ is the unit roundoff. Thus the
two algorithms are numerically stable for these matrices. But
ELLL is sometimes not numerically stable.

Fig. 1. Matrices of Type 1 - Flops

Fig. 2. Matrices of Type 1 - Backward Errors

Fig. 3. Matrices of Type 2 - Flops

Fig. 4. Matrices of Type 2 - BER

For matrices of Type 2, Figure 3 displays the average flops 
of the three reduction algorithms over 200 runs for each 
dimension n. Again we see that PLLL is faster than both LLL 
and ELLL. 

To see how the reduction affects the performance of the
Babai integer point, for matrices of Type 2, we constructed
the linear model $y = Hx + v$, where $x$ is an integer vector
randomly generated and $v \sim \mathcal{N}(0, 0.2^2 I)$. Figure 4 shows the
average bit error rate (BER) over 200 runs for each dimension
$n$. From the results we observe that the computed Babai points
obtained by using LLL and PLLL performed perfectly, but the
computed Babai points obtained by using ELLL performed
badly when the dimension $n$ is larger than 15. Our simulations
showed that the computed ILS solutions obtained by using
the three reduction algorithms behaved similarly. All these
indicate that ELLL can give a very poor estimate of $x$ due
to its numerical stability problem.